
    MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs

    Full text link
    We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment, such as multirobot coordination, network traffic control, or distributed resource allocation. Solving such problems effectively is a major challenge in the area of planning under uncertainty. Our solution is based on a synthesis of classical heuristic search and decentralized control theory. Experimental results show that MAA* has significant advantages. We introduce an anytime variant of MAA* and conclude with a discussion of promising extensions, such as an approach to solving infinite-horizon problems. Comment: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI 2005).
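
    The search itself can be pictured as ordinary A* over partially specified joint policies: each node fixes the first few decision steps of every agent's policy tree, and the priority combines the exact value of the fixed part with an optimistic bound on the rest. The following Python sketch illustrates that loop only, not the authors' implementation; evaluate, upper_bound, expand, and the node's depth attribute are hypothetical problem-specific hooks.

        import heapq
        import itertools

        def maa_star(root, horizon, evaluate, upper_bound, expand):
            # Best-first search over partially specified joint policies.
            # evaluate(node): exact value of the steps fixed so far (hypothetical hook).
            # upper_bound(node): optimistic bound on the value of the remaining steps.
            # expand(node): children extending every agent's policy tree by one step.
            tie = itertools.count()                  # break ties in the heap
            frontier = [(-(evaluate(root) + upper_bound(root)), next(tie), root)]
            best_value, best_policy = float("-inf"), None
            while frontier:
                neg_f, _, node = heapq.heappop(frontier)
                if -neg_f <= best_value:
                    break                            # no open node can beat the incumbent
                if node.depth == horizon:            # joint policy fully specified
                    value = evaluate(node)
                    if value > best_value:
                        best_value, best_policy = value, node
                    continue
                for child in expand(node):
                    f = evaluate(child) + upper_bound(child)
                    if f > best_value:               # prune dominated children
                        heapq.heappush(frontier, (-f, next(tie), child))
            return best_policy, best_value

    The anytime variant mentioned in the abstract falls out naturally: the incumbent best_policy is a valid answer whenever the search is interrupted.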

    Comparison of Task-Allocation Algorithms in Frontier-Based Multi-Robot Exploration

    Get PDF
    Abstract. In this paper, we address the problem of efficiently allocating navigational goals in multi-robot exploration of an unknown environment. Goal candidate locations are repeatedly determined during the exploration, and the assignment of the candidates to the robots is then solved as a task-allocation problem. More frequent decision-making may improve exploration performance, but in a practical deployment of the exploration strategies, the frequency depends on the computational complexity of the task-allocation algorithm and the available computational resources. Therefore, we propose an evaluation framework to study exploration strategies independently of the available computational resources, and we report a comparison of selected task-allocation algorithms deployed in multi-robot exploration.
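
    As a concrete illustration of what one task-allocation step looks like, here is a toy greedy assignment of frontier goal candidates to robots by Euclidean distance; it stands in for the simpler end of the strategies such a comparison might include and is not the paper's code.

        import math

        def greedy_allocation(robots, goals):
            # robots, goals: lists of (x, y) positions.
            # Each robot in turn claims the nearest still-unclaimed goal candidate.
            remaining = list(goals)
            assignment = {}
            for i, robot in enumerate(robots):
                if not remaining:
                    break
                goal = min(remaining, key=lambda g: math.dist(robot, g))
                assignment[i] = goal
                remaining.remove(goal)
            return assignment

    Costlier alternatives, for instance an optimal assignment over a full robot-goal cost matrix, trade computation time for allocation quality, which is exactly the trade-off the proposed framework is meant to expose.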

    Authority sharing in a self-organized swarm of drones

    Get PDF
    This paper addresses the human control of a large number of unmanned aerial vehicles (UAVs) for the surveillance of a sensitive outdoor area. We leverage the combination of a sensor network and environment marking for swarm intelligence. This grants autonomy to the UAV system and allows the operator to focus on noteworthy tasks such as counter-intrusion. The paper presents the experimental results of the SMAART project.
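
    Environment marking here is a form of stigmergy: UAVs leave evaporating marks where they have been and steer toward the least-recently-visited cells. The following toy grid-world sketch illustrates the principle only; the SMAART system combines it with a real sensor network, and all names and constants below are illustrative.

        import random

        EVAPORATION = 0.95

        def step(pheromone, uav_positions):
            # Evaporate every mark, then let each UAV deposit a mark where it
            # stands and move to the neighboring cell with the weakest mark,
            # which spreads the swarm over the surveillance area.
            for cell in pheromone:
                pheromone[cell] *= EVAPORATION
            next_positions = []
            for x, y in uav_positions:
                pheromone[(x, y)] = pheromone.get((x, y), 0.0) + 1.0
                neighbors = [(x + dx, y + dy)
                             for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
                random.shuffle(neighbors)        # break ties randomly
                next_positions.append(min(neighbors,
                                          key=lambda c: pheromone.get(c, 0.0)))
            return next_positions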

    Is attention to bounding boxes all you need for pedestrian action prediction?

    Get PDF
    The human driver is no longer the only one concerned with the complexity of driving scenarios; autonomous vehicles (AVs) are similarly becoming involved in the process. Nowadays, the deployment of AVs in urban areas raises essential safety concerns for vulnerable road users (VRUs) such as pedestrians. Therefore, to make the roads safer, it is critical to classify and predict their future behavior. In this paper, we present a framework based on multiple variations of Transformer models to reason attentively about the dynamic evolution of a pedestrian's past trajectory and predict the future action of crossing or not crossing the street. We show that using only bounding boxes as input to our model can outperform the previous state-of-the-art models, reaching a prediction accuracy of 91% and an F1-score of 0.83 on the PIE dataset up to two seconds ahead in the future. In addition, we introduce a large simulated dataset (CP2A), generated with CARLA, for action prediction. Our model similarly reaches high accuracy (91%) and F1-score (0.91) on this dataset. Interestingly, we show that pre-training our Transformer model on the simulated dataset and then fine-tuning it on the real dataset can be very effective for the action prediction task.
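
    The core idea, a Transformer encoder fed only with the past bounding-box sequence, is compact enough to sketch. The PyTorch snippet below is a minimal illustrative stand-in, with layer sizes chosen arbitrarily rather than taken from the paper.

        import torch
        import torch.nn as nn

        class BBoxActionPredictor(nn.Module):
            # Encodes a pedestrian's past bounding boxes (x, y, w, h) with a
            # Transformer and outputs the probability of crossing the street.
            def __init__(self, d_model=64, nhead=4, num_layers=2, seq_len=16):
                super().__init__()
                self.embed = nn.Linear(4, d_model)
                self.pos = nn.Parameter(torch.zeros(1, seq_len, d_model))
                layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, num_layers)
                self.head = nn.Linear(d_model, 1)

            def forward(self, boxes):                        # boxes: (B, T, 4)
                h = self.encoder(self.embed(boxes) + self.pos)
                return torch.sigmoid(self.head(h.mean(dim=1)))   # (B, 1)

        # Usage: crossing probabilities for a batch of 8 track snippets.
        probs = BBoxActionPredictor()(torch.randn(8, 16, 4))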

    A study of different combinations of adaptive behaviors

    No full text
    This article focuses on the automated synthesis of agents in an uncertain environment, working in the setting of Reinforcement Learning and, more precisely, of Partially Observable Markov Decision Processes. The agents (with no model of their environment and no short-term memory) face multiple motivations/goals simultaneously, a problem related to the field of Action Selection. We propose and evaluate various Action Selection architectures. They all combine already-known basic behaviors in an adaptive manner, learning the tuning of the combination so as to maximize the agent's payoff. The logical continuation of this work is to automate the selection and design of the basic behaviors themselves.
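
    One of the simpler combination architectures such a study might compare treats the choice among basic behaviors as a bandit problem: keep a learned value per behavior and update it from the observed payoff. The sketch below illustrates that scheme only; the basic behaviors themselves and all constants are assumed given.

        import random

        def select_behavior(q, epsilon=0.1):
            # Epsilon-greedy choice over the basic behaviors' learned values.
            if random.random() < epsilon:
                return random.randrange(len(q))
            return max(range(len(q)), key=q.__getitem__)

        def update(q, chosen, reward, alpha=0.1):
            # Move the chosen behavior's value toward the observed payoff.
            q[chosen] += alpha * (reward - q[chosen])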

    Incremental Reinforcement Learning for designing Multi-Agent Systems

    No full text
    Designing individual agents so that, when put together, they reach a given global goal is not an easy task. One solution to automatically build such large Multi-Agent Systems is to use decentralized learning: each agent learns its own behavior by itself. For that purpose, Reinforcement Learning methods are very attractive, as they do not require a solution of the problem to be known beforehand. Nevertheless, many hard points must be addressed for such a learning process to be viable. Among others, the credit assignment problem, combinatorial explosion, and local perception of the world seem the most crucial, and they prevent optimal behavior. In this paper, we propose a framework based on gradual learning of harder and harder tasks until the desired global behavior is reached. The applicability of our paradigm is tested in computer experiments where many agents have to coordinate to reach a global goal. Our results show that incremental learning leads to better performance than more classical techniques. We then discuss several improvements which could lead to even better performance.
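
    The incremental scheme reduces to a curriculum loop: train on a sequence of progressively harder versions of the task, warm-starting each stage from the previous one. A skeleton under those assumptions, with make_task and train as hypothetical hooks standing in for the environment and the learner:

        def incremental_learning(make_task, train, levels, policy=None):
            for level in levels:                   # e.g. 1, 2, 3, ... harder tasks
                task = make_task(level)            # harder variant of the same goal
                policy = train(task, init=policy)  # warm-start from previous stage
            return policy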